Evaluating comprehension of natural and synthetic conversational speech

Authors

Mirjam Wester

Oliver Watts

Gustav Eje Henter

Abstract

Current speech synthesis methods typically operate on isolated sentences and lack convincing prosody when generating longer segments of speech. Similarly, prevailing TTS evaluation paradigms, such as intelligibility (transcription word error rate) or MOS, only score sentences in isolation, even though overall comprehension arguably is more important for speech-based communication. In an effort to develop more ecologically relevant evaluation techniques that go beyond isolated sentences, we investigated comprehension of natural and synthetic speech dialogues. Specifically, we tested listener comprehension on long segments of spontaneous and engaging conversational speech (three 10-minute radio interviews of comedians). Interviews were reproduced either as natural speech, synthesised from carefully prepared transcripts, or synthesised using durations from forced-alignment against the natural speech, all in a balanced design. Comprehension was measured using multiple choice questions. A significant difference was measured between the comprehension/retention of natural speech (74% correct responses) and synthetic speech with forced-aligned durations (61% correct responses). However, no significant difference was observed between natural and regular synthetic speech (70% correct responses). Effective evaluation of comprehension remains elusive.
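For readers unfamiliar with the sentence-level intelligibility measure the abstract contrasts with comprehension testing, the following is a minimal sketch of transcription word error rate: the word-level edit distance between a reference transcript and a listener's transcription, divided by the number of reference words. The function name `word_error_rate` and the lowercase, whitespace-based tokenisation are assumptions made for illustration; they are not taken from the paper, which may have used different text normalisation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration: tokenisation here is a plain
    lowercase whitespace split, which is an assumption, not the paper's method.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of six reference words is missing from the transcription.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A score of this kind is computed per sentence, which is why it says little about whether a listener followed a ten-minute conversation as a whole.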

Similar resources

Synthesis and evaluation of conversational characteristics in speech synthesis

Conventional synthetic voices can synthesise neutral read aloud speech well. But, to make synthetic speech more suitable for a wider range of applications, the voices need to express more than just the word identity. We need to develop voices that can partake in a conversation and express, e.g. agreement, disagreement, hesitation, in a natural and believable manner. In speech synthesis there ar...

Comprehension of KTH text-to-speech with "listening speed" paradigm

The comprehension of natural and synthetic speech in Swedish and American English was investigated using a sentence-by-sentence listening paradigm. The synthesized speech was generated by the KTH text-to-speech systems. Results indicated that sentence listening times were significantly longer only for American English synthetic speech as compared to natural speech. Text difficulty was found to ...

Comprehension of synthesized speech while driving and in the lab

Two studies were conducted to measure the comprehensibility of synthetic speech with current text-to-speech technology. Baseline measurements for each subject were obtained using recorded natural speech. The first study was conducted in a quiet lab with no distractions. Half the subjects were allowed to take notes while listening, the other half were not. Findings showed that there was no signif...

Perception and Comprehension of Synthetic Speech

An extensive body of research on the perception of synthetic speech carried out over the past 30 years has established that listeners have much more difficulty perceiving synthetic speech than natural speech. Differences in perceptual processing have been found in a variety of behavioral tasks, including assessments of segmental intelligibility, word recall, lexical decision, sentence transcrip...

Comprehension of Speech Presented at Synthetically Accelerated Rates: Evaluating Training and Practice Effects

The ability to monitor multiple sources of concurrent auditory information is an integral component of Navy watchstanding operations. However, this leads to attentionally demanding environments. The present study tested the utility of a potential solution to listening to multiple speech communications in an auditory display environment: presenting speech serially at synthetically accelerated ra...

Publication date: 2016
